Extracting Lexico-semantic Relations from Document Titles Using Sublanguage Analysis
Abstract
The sublanguage analysis methodology is based on the theory that texts generated by a particular group of people possess their own lexical, syntactic, and semantic characteristics. The main thrust of the research reported here is to apply the methodology to Korean document titles: although much research has been done with English texts, there has been little work applying it to Korean. The goal was to find a way to construct a lexico-semantic representation from document titles and to lay firm ground for automating the entire process by developing the necessary components and analyzing the linguistic phenomena involved. We conducted intellectual analyses and extracted a number of lexico-semantic relations between concepts occurring in document titles. For each relation, we generated relation-revealing patterns (RRPs). A program was written that uses the RRPs to extract relations together with the concepts each relation connects. To determine the boundaries of concepts, we applied a chart-parsing algorithm. In our experiments with journal titles, we achieved an accuracy of about 80.9% in identifying the relations and the boundaries of the concepts they connect.
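The idea of a relation-revealing pattern can be illustrated with a minimal sketch. The abstract does not list the actual Korean RRPs or relation names, so everything below is an assumption made for illustration: the three English cue words, the relation labels (`uses-method`, `has-source`, `has-purpose`), and the helper `extract_relation` are all hypothetical. The sketch also finds concept boundaries by naive regular-expression splitting, not by the chart-parsing algorithm the abstract mentions.

```python
import re
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    name: str   # hypothetical relation label
    left: str   # concept to the left of the cue word
    right: str  # concept to the right of the cue word


# Hypothetical RRPs for English titles; the real patterns target Korean
# titles and are not given in the abstract. Patterns are tried in order.
RRPS = [
    ("uses-method", re.compile(r"^(?P<left>.+?)\s+using\s+(?P<right>.+)$", re.IGNORECASE)),
    ("has-source", re.compile(r"^(?P<left>.+?)\s+from\s+(?P<right>.+)$", re.IGNORECASE)),
    ("has-purpose", re.compile(r"^(?P<left>.+?)\s+for\s+(?P<right>.+)$", re.IGNORECASE)),
]


def extract_relation(title: str) -> Optional[Relation]:
    """Return the first relation whose pattern matches the title, else None."""
    cleaned = title.strip()
    for name, pattern in RRPS:
        match = pattern.match(cleaned)
        if match:
            return Relation(name, match.group("left").strip(), match.group("right").strip())
    return None


if __name__ == "__main__":
    print(extract_relation(
        "Extracting Lexico-semantic Relations from Document Titles Using Sublanguage Analysis"
    ))
```

Splitting on the first cue word is the weakest part of this sketch: a left-hand concept such as "Extracting Lexico-semantic Relations from Document Titles" itself contains a further relation, which is the kind of boundary ambiguity a chart parser is better suited to resolve.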
Similar resources
Automatic Acquisition of Lexico-semantic Knowledge for QA
We present an experiment for finding semantically similar words on the basis of a parsed corpus of Dutch text and show that the acquired information correlates with relations found in Dutch EuroWordNet. Next, we demonstrate how the acquired knowledge can be used to boost the performance of an open-domain question answering system for Dutch. Automatically acquired lexico-semantic information is ...
Extraction of Lexico-Syntactic Information and Acquisition of Causality Schemas for Text Annotation
We present the INSYSE method for the annotation of texts, based on extraction of semantic relations from syntactic structures. We apply this method to a corpus of 5000 Medline abstracts about central nervous system diseases and gene interactions. Our cooperative approach focuses on (1) extracting lexico-syntactic information from sentences in the corpus comprising causation lexemes and (2) elab...
An Iterative Method of Extracting Chinese ISA Relations for Ontology Learning
Automatic acquisition of ISA relations is a basic problem in knowledge acquisition from text. We present an iterative method for extracting ISA relations from large Chinese free text for ontology learning. First, it discovers an initial set of sentences in a free-text corpus using several special Chinese lexico-syntactic patterns. Second, we combine outside layer removal and inside layer gather...
Extraction of Semantic Relationships from Academic Papers using Syntactic Patterns
Integrating concept and citation networks on a specific research subject can help researchers focus their own work or use methods described in prior works. In this paper, we propose a method to extract semantic relations from concepts and citation in the descriptions of related work. Specifically, we examined (i) topic-paper relations between research topics and reference papers and (ii) method...
Combining Vector Space Model and Multi Word Term Extraction for Semantic Query Refinement
In this paper, we target document ranking in a highly technical field with the aim to approximate a ranking that is obtained through an existing ontology (knowledge structure). We test and combine symbolic and vector space models (VSM). Our symbolic approach relies on shallow NLP and on internal linguistic relations between Multi-Word Terms (MWTs). Documents are ranked based on different semant...
Publication date: 2007